I/O Optimization and Evaluation for Tertiary Storage Systems

Authors

  • X. Shen
  • A. Choudhary
Abstract

Large-scale parallel scientific applications generate such huge amounts of data that tertiary storage systems have emerged as a popular place to hold it. SRB, a uniform interface to various storage systems, including tertiary storage systems such as HPSS and UniTree, has become an important and convenient way to access tertiary data across networks in a distributed environment. But SRB is not optimized for I/O performance: like UNIX I/O, each SRB I/O call to a storage system must access a contiguous piece of data. For many access patterns, this results in numerous I/O calls, which are very expensive. In this paper, we present an optimization library (SRB-OL) that is built on top of the SRB low-level I/O functions and employs state-of-the-art I/O optimizations found in secondary storage systems, such as collective I/O and data sieving. We also present a novel optimization scheme, superfile, that deals efficiently with large numbers of small files. We further incorporate a subfile technique and other SRB features, such as container, migrate, stage and purge, into SRB-OL. How these optimizations are used is decided by a Meta-data Management System (MDMS) [8] that resides one level above SRB-OL. The user provides access pattern information (hints) to MDMS through the user application; MDMS uses these hints to choose an optimal I/O approach and passes the decision to SRB-OL. Finally, SRB-OL performs optimized SRB I/O calls to access data residing on tertiary storage systems. This layered design and implementation makes it possible to design the low-level libraries independently and to port the high-level layers easily. To give a quantitative view of the optimized SRB I/O functions, we propose a performance model based on extensive I/O experiments. Using this performance model, we show that collective I/O, superfile and the other optimizations yield significant performance improvements. In addition, we present an I/O Performance Evaluator that can estimate I/O cost before the user actually carries out her experiment, which is of considerable benefit to the user when running her application.
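The cost of contiguous-only I/O calls, and the way data sieving reduces the number of calls, can be illustrated with a minimal sketch. It is not SRB or SRB-OL code: an in-memory file stands in for remote storage, and the names `read_naive` and `read_sieved` are hypothetical helpers introduced only for this illustration.

```python
import io


def read_naive(f, requests):
    """Issue one contiguous read call per (offset, length) request."""
    pieces = []
    calls = 0
    for offset, length in requests:
        f.seek(offset)
        pieces.append(f.read(length))
        calls += 1
    return pieces, calls


def read_sieved(f, requests):
    """Data sieving: one large contiguous read that covers every request,
    followed by extraction of the wanted pieces in memory."""
    if not requests:
        return [], 0
    start = min(offset for offset, _ in requests)
    end = max(offset + length for offset, length in requests)
    f.seek(start)
    buf = f.read(end - start)  # single I/O call, includes unwanted gaps
    pieces = [buf[offset - start:offset - start + length]
              for offset, length in requests]
    return pieces, 1


# Stand-in for a remote file: 1024 bytes held in memory (an assumption).
data = bytes(range(256)) * 4
f = io.BytesIO(data)

# A strided access pattern: 8 bytes out of every 64.
requests = [(i * 64, 8) for i in range(16)]

naive_pieces, naive_calls = read_naive(f, requests)
sieved_pieces, sieved_calls = read_sieved(f, requests)
print(naive_calls, sieved_calls)
```

The sieved version transfers more bytes than were requested (the gaps between pieces) in exchange for fewer calls, so whether it pays off depends on the per-call cost relative to the transfer cost.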

Similar resources

Remote I/O Optimization and Evaluation for Tertiary Storage Systems through Storage Resource Broker

Large-scale parallel scientific applications are generating huge amounts of data that tertiary storage systems emerge as a popular place to hold them. SRB, a uniform interface to various storage systems including tertiary storage systems such as HPSS, UniTree etc., becomes an important and convenient way to access tertiary data across networks in a distributed environment. But SRB is not optimi...

Efficient Buffering for Concurrent Disk and Tape I/O

Tertiary storage is becoming increasingly important for many organizations involved in large-scale data analysis and data mining activities. Yet database management systems (DBMS) and other data-intensive systems do not incorporate tertiary storage as a first-class citizen in the storage hierarchy. For instance, the typical solution for bringing tertiary-resident data under the control of a DBMS ...

Single Query Optimization for Tertiary Memory

We present query execution strategies that are optimized for the characteristics of tertiary memory devices. Traditional query execution methods are oriented to magnetic disk or main memory and perform poorly on tertiary memory. Our methods use ordering and batching techniques on the I/O requests to reduce the media switch cost and seek cost on these devices. Some of our methods are provably op...

Optimization of Fermentation Time for Iranian Black Tea Production

The optimum fermentation times of black tea manufactured by two systems, Orthodox and CTC (cut, tear & curl), were investigated by measuring quality parameters of black tea, such as theaflavin, thearubigin, highly polymerized substances and total liquid colour, during the fermentation stage. Optimum fermentation times from the beginning of fermentation were determined to be 60 min and 15...

A Java Based Model for I/O Scheduling in Tertiary Storage Subsystems

Modern I/O subsystems include large storage servers which are configured to include multiple online and offline storage media and to deal with a large number of requests with unpredictable access patterns. The problem of minimizing the cost of accessing data stored in all media is critical for the performance of the system. Given the large storage requirements of modern applications, Tertiary Stora...

Publication date: 2001